Oral presentation

Porting a state-of-the-art communication avoiding Krylov subspace solver on P100 GPUs

Ali, Y.*; Ina, Takuya*; Onodera, Naoyuki; Idomura, Yasuhiro

No journal

Krylov subspace solvers for the pressure Poisson equation account for $$\sim 90\%$$ of the total computing cost in extreme-scale multi-phase CFD simulations. To accelerate the Poisson solver, we port a Chebyshev Basis Communication-Avoiding Conjugate Gradient (CBCG) solver with block Jacobi (BJ) preconditioning to P100 GPUs. The CBCG solver consists of BJ preconditioning, sparse matrix-vector products (SpMV), and tall-skinny matrix operations. We redesign the BJ preconditioner for thread-block parallelization and efficient coalesced data loads, and apply batched GEMM to the tall-skinny matrix operations. With these optimizations, all main kernels achieved $$\sim 90\%$$ of the theoretical performance based on roofline estimation, and an order-of-magnitude speedup in single-node performance was obtained over CPU nodes.
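
The roofline estimate mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the presentation: the function names, the example kernel (a double-precision vector update y = a*x + y), and the hardware figures (a nominal 5300 GFLOP/s FP64 peak and 732 GB/s memory bandwidth, assumed here for a P100) are illustrative assumptions and should be checked against the vendor datasheet for the actual device.

```python
def roofline_gflops(flops, bytes_moved, peak_gflops, bandwidth_gbs):
    """Attainable performance (GFLOP/s) under the roofline model.

    flops and bytes_moved describe the work and memory traffic per
    element (or per kernel call); their ratio is the arithmetic intensity.
    """
    # Memory-bound ceiling: arithmetic intensity times memory bandwidth.
    memory_bound = flops * bandwidth_gbs / bytes_moved
    # The roofline is the lower of the compute and memory ceilings.
    return min(peak_gflops, memory_bound)


def roofline_fraction(measured_gflops, attainable_gflops):
    """Fraction of the roofline ceiling that a measured kernel reaches."""
    return measured_gflops / attainable_gflops


# Assumed nominal hardware figures (hypothetical inputs, see lead-in).
PEAK_GFLOPS = 5300.0
BANDWIDTH_GBS = 732.0

# Example kernel: y = a*x + y in double precision.
# Per element: 2 flops; 24 bytes (read x, read y, write y).
attainable = roofline_gflops(2, 24, PEAK_GFLOPS, BANDWIDTH_GBS)
print(f"attainable: {attainable:.1f} GFLOP/s")
```

For a memory-bound kernel such as this one, the ceiling is set by bandwidth rather than peak arithmetic throughput, so a statement like "90% of the theoretical performance" refers to the measured rate divided by this ceiling, as computed by `roofline_fraction`.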
